
xcp-ng: allow passing vm boot options #5335

Merged
yadvr merged 10 commits into apache:4.15 from shapeblue:add-bootmodetype-xen
Aug 31, 2021

Conversation

@shwstppr
Contributor

@shwstppr shwstppr commented Aug 18, 2021

Description

Allows selecting the boot type and boot mode for XenServer/XCP-ng VMs in the UI.
XCP-ng 8.2 allows the UEFI boot type for guest VMs.
The changes set and update the host's UEFI capability in the DB when a host is added or reconnects, and honor the VM's boot details when orchestrating VM start on an XCP-ng hypervisor.

Addresses #5204

Types of changes

  • Breaking change (fix or feature that would cause existing functionality to change)
  • New feature (non-breaking change which adds functionality)
  • Bug fix (non-breaking change which fixes an issue)
  • Enhancement (improves an existing feature and functionality)
  • Cleanup (Code refactoring and cleanup, that may add test cases)

Feature/Enhancement Scale or Bug Severity

Feature/Enhancement Scale

  • Major
  • Minor

Bug Severity

  • BLOCKER
  • Critical
  • Major
  • Minor
  • Trivial

Screenshots (if appropriate):

How Has This Been Tested?

  • Deployed ACS 4.15.1 env with 2x XCP-ng 8.1 hosts
  • Deployed Windows 10, Windows Server 2019, and Windows Server 2008 R2 VMs
  • Stopped VMs to upgrade XCP-ng hosts.
  • Upgraded ACS and upgraded XCP-ng hosts to 8.2
  • Started VMs. <- VM started fine (had to change guest_os for Windows Server 2008 R2 VM)

Also, tested Windows Server 2019 deployment with UEFI boot type:

(localcloud) SBCM5> > list virtualmachines id=a5187ead-dba6-4a27-831d-4edd8cf6f9d8 filter=id,name,hostid,hostname,state,templateid,templatename,guestosid,details,
{
  "count": 1,
  "virtualmachine": [
    {
      "details": {
        "UEFI": "LEGACY",
        "cpuOvercommitRatio": "3.0",
        "hypervisortoolsversion": "xenserver61",
        "memoryOvercommitRatio": "2.0",
        "platform": "device-model:qemu-upstream-uefi;videoram:8;viridian_stimer:true;apic:true;device_id:0002;viridian:true;timeoffset:0;viridian_reference_tsc:true;hpet:true;nx:true;viridian_time_ref_count:true;viridian_crash_ctl:true;vga:std;viridian_apic_assist:true;cores-per-socket:2;pae:true;acpi:1;secureboot:false"
      },
      "guestosid": "030bb751-0498-11ec-b7f6-1e00cb00014b",
      "hostid": "6f6a1c45-7d22-4045-b619-b6a63851afbd",
      "hostname": "pr5335-t1757-xcpng81-xs2",
      "id": "a5187ead-dba6-4a27-831d-4edd8cf6f9d8",
      "name": "t18",
      "state": "Running",
      "templateid": "20e24144-932e-46c7-b6a6-1d61ca645e0b",
      "templatename": "WinServer2019"
    }
  ]
}
(localcloud) SBCM5> > list hosts id=6f6a1c45-7d22-4045-b619-b6a63851afbd filter=id,name,hypervisor,hypervisorversion,details
{
  "count": 1,
  "host": [
    {
      "hypervisor": "XenServer",
      "hypervisorversion": "8.2.0",
      "id": "6f6a1c45-7d22-4045-b619-b6a63851afbd",
      "name": "pr5335-t1757-xcpng81-xs2"
    }
  ]
}
[11:54 pr5335-t1757-xcpng81-xs2 ~]# xe vm-list uuid=061f8ba3-0add-3ca8-a488-692369f6c55a params=uuid,name-label,platform,HVM-boot-params
uuid ( RO)               : 061f8ba3-0add-3ca8-a488-692369f6c55a
         name-label ( RW): i-2-21-VM
           platform (MRW): timeoffset: 0; device-model: qemu-upstream-uefi; videoram: 8; viridian_stimer: true; apic: true; device_id: 0002; viridian: true; viridian_reference_tsc: true; hpet: true; nx: true; viridian_time_ref_count: true; viridian_crash_ctl: true; vga: std; viridian_apic_assist: true; cores-per-socket: 2; pae: true; acpi: 1; secureboot: false
    HVM-boot-params (MRW): firmware: uefi; order: dc

Screenshot from 2021-08-24 16-52-53

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
@nvazquez nvazquez added this to the 4.15.2.0 milestone Aug 19, 2021
@nvazquez
Contributor

@blueorangutan package

@blueorangutan

@nvazquez a Jenkins job has been kicked to build packages. I'll keep you posted as I make progress.

@blueorangutan

Packaging result: ✔️ el7 ✔️ el8 ✔️ debian. SL-JID 921

@nvazquez
Contributor

@blueorangutan help

@blueorangutan

@nvazquez I understand these words: "help", "hello", "thanks", "package", "test"
Test command usage: test [mgmt os] [hypervisor] [keepEnv]
Mgmt OS options: ['centos7', 'centos6', 'alma8', 'ubuntu18', 'suse15', 'ubuntu20', 'rocky8', 'centos8']
Hypervisor options: ['kvm-centos6', 'kvm-centos7', 'kvm-centos8', 'kvm-rocky8', 'kvm-alma8', 'kvm-ubuntu18', 'kvm-ubuntu20', 'kvm-suse15', 'vmware-55u3', 'vmware-60u2', 'vmware-65u2', 'vmware-67u3', 'vmware-70u1', 'xenserver-65sp1', 'xenserver-71', 'xenserver-74', 'xcpng74', 'xcpng76', 'xcpng80', 'xcpng81']
Note: when keepEnv is passed, you need to specify mgmt server os and hypervisor or use the matrix command.

Blessed contributors for kicking Trillian test jobs: ['rhtyd', 'nvazquez', 'PaulAngus', 'borisstoyanov', 'DaanHoogland', 'shwstppr', 'andrijapanicsb', 'Spaceman1984', 'Pearl1594', 'davidjumani', 'harikrishna-patnala', 'vladimirpetrov', 'sureshanaparti', 'weizhouapache']

@nvazquez
Contributor

@blueorangutan test centos7 xcpng81

@blueorangutan

@nvazquez a Trillian-Jenkins test job (centos7 mgmt + xcpng81) has been kicked to run smoke tests

@blueorangutan

Trillian Build Failed (tid-1708)

@yadvr
Member

yadvr commented Aug 19, 2021

@blueorangutan test centos7 xcpng82

@blueorangutan

@rhtyd unsupported parameters provided. Supported mgmt server os are: centos7, centos6, alma8, ubuntu18, suse15, ubuntu20, rocky8, centos8. Supported hypervisors are: kvm-centos6, kvm-centos7, kvm-centos8, kvm-rocky8, kvm-alma8, kvm-ubuntu18, kvm-ubuntu20, kvm-suse15, vmware-55u3, vmware-60u2, vmware-65u2, vmware-67u3, vmware-70u1, xenserver-65sp1, xenserver-71, xenserver-74, xcpng74, xcpng76, xcpng80, xcpng81

@yadvr
Member

yadvr commented Aug 19, 2021

@blueorangutan test centos7 xcpng82

@blueorangutan

@rhtyd a Trillian-Jenkins test job (centos7 mgmt + xcpng82) has been kicked to run smoke tests

@davidjumani
Contributor

@blueorangutan test centos7 xcpng82

@blueorangutan

@davidjumani a Trillian-Jenkins test job (centos7 mgmt + xcpng82) has been kicked to run smoke tests

@shwstppr
Contributor Author

@nvazquez @rhtyd @davidjumani jfyi, this PR is very much in draft. I've not even built/tested it locally 😀

@blueorangutan

Trillian Build Failed (tid-1717)

@yadvr
Member

yadvr commented Aug 19, 2021

Cool okay @shwstppr

@blueorangutan

Trillian Build Failed (tid-1719)

@davidjumani
Contributor

Was testing the integration @shwstppr 😄

@tsinik-dw

tsinik-dw commented Aug 19, 2021

Looking at the code changes I understand that createVmFromTemplate makes use of the setVmBootDetails to assign bootType.
I am wondering if this has to be added to migrateVM too, to take care of the existing VMs migrating from e.g. XenServer 7.x to XCP-NG 8.2

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
@shwstppr
Contributor Author

@tsinik-dw I will have to check migration, but at the moment I think that may not be needed: the createVmFromTemplate method (and in turn setVmBootDetails) is called every time a StartCommand is sent for a VM, so the boot type should be taken care of then. I am not sure right now how live migration would behave.

Another concern for me right now is to check and store host capability for UEFI support at the time of adding host.

@blueorangutan package

@blueorangutan

@shwstppr a Jenkins job has been kicked to build packages. I'll keep you posted as I make progress.

@shwstppr
Contributor Author

With the current changes, by setting the boot type and mode from the UI/API, I was able to get a VM running with the desired boot options:

uuid ( RO)                                  : 4efcfa1e-11db-becb-0d33-ff4b3faeee1c
                            name-label ( RW): i-2-38-VM
                      name-description ( RW): Template which allows VM installation from install media
                          user-version ( RW): 1
                         is-a-template ( RW): false
                   is-default-template ( RW): false
                         is-a-snapshot ( RO): false
                           snapshot-of ( RO): <not in database>
                             snapshots ( RO): 
                         snapshot-time ( RO): 19700101T00:00:00Z
                         snapshot-info ( RO): 
                                parent ( RO): <not in database>
                              children ( RO): 
                     is-control-domain ( RO): false
                           power-state ( RO): running
                         memory-actual ( RO): 538959872
                         memory-target ( RO): <expensive field>
                       memory-overhead ( RO): 7340032
                     memory-static-max ( RW): 536870912
                    memory-dynamic-max ( RW): 536870912
                    memory-dynamic-min ( RW): 536870912
                     memory-static-min ( RW): 536870912
                      suspend-VDI-uuid ( RW): <not in database>
                       suspend-SR-uuid ( RW): <not in database>
                          VCPUs-params (MRW): cap: 0; weight: 57
                             VCPUs-max ( RW): 1
                      VCPUs-at-startup ( RW): 1
                actions-after-shutdown ( RW): Destroy
                  actions-after-reboot ( RW): Restart
                   actions-after-crash ( RW): Destroy
                         console-uuids (SRO): a3cbacd5-e8b5-b225-b314-f83d6d3de1fd
                                   hvm ( RO): true
                              platform (MRW): timeoffset: 0; device-model: qemu-upstream-uefi; apic: true; viridian: true; pae: true; acpi: 1; hpet: true; secureboot: true; nx: true
                    allowed-operations (SRO): changing_dynamic_range; hard_reboot; hard_shutdown; pause; snapshot
                    current-operations (SRO): 
                    blocked-operations (MRW): 
                   allowed-VBD-devices (SRO): <expensive field>
                   allowed-VIF-devices (SRO): <expensive field>
                        possible-hosts ( RO): <expensive field>
                           domain-type ( RW): hvm
                   current-domain-type ( RO): hvm
                       HVM-boot-policy ( RW): BIOS order
                       HVM-boot-params (MRW): firmware: uefi; order: cd
                 HVM-shadow-multiplier ( RW): 1.000
                             PV-kernel ( RW): 
                            PV-ramdisk ( RW): 
                               PV-args ( RW): 
                        PV-legacy-args ( RW): 
                         PV-bootloader ( RW): 
                    PV-bootloader-args ( RW): 
                   last-boot-CPU-flags ( RO): vendor: GenuineIntel; features: 1f8bfbff-f7fa3223-2d93fbff-00000523-00000007-009c07ab-00000004-00000000-00001000-8c000400-00000000-00000000-00000000-00000000-00000000
                      last-boot-record ( RO): <expensive field>
                           resident-on ( RO): 53b35bd2-4724-477c-921e-3d9f117d550c
                              affinity ( RW): 53b35bd2-4724-477c-921e-3d9f117d550c
                          other-config (MRW): mac_seed: 0f75fc7f-274a-aef1-a14e-9698ba6d240f; import_task: OpaqueRef:cd1ae421-d3c5-4119-9d0a-d962df0eb38f; install-methods: cdrom; vm_uuid: f6f543aa-de6f-499a-9ca4-4755acea14b3
                                dom-id ( RO): 12
                       recommendations ( RO): <restrictions><restriction field="memory-static-max" max="137438953472"/><restriction field="vcpus-max" max="32"/><restriction field="has-vendor-device" value="false"/><restriction field="supports-bios" value="yes"/><restriction field="supports-uefi" value="yes"/><restriction field="supports-secure-boot" value="yes"/><restriction max="255" property="number-of-vbds"/><restriction max="7" property="number-of-vifs"/></restrictions>
                         xenstore-data (MRW): vm-data: ; vm-data/mmio-hole-size: 268435456
            ha-always-run ( RW) [DEPRECATED]: false
                   ha-restart-priority ( RW): 
                                 blobs ( RO): 
                            start-time ( RO): 20210820T11:29:57Z
                          install-time ( RO): 19700101T00:00:00Z
                          VCPUs-number ( RO): 1
                     VCPUs-utilisation (MRO): <expensive field>
                            os-version (MRO): 
                    PV-drivers-version (MRO): 
    PV-drivers-up-to-date ( RO) [DEPRECATED]: true
                                memory (MRO): 
                                 disks (MRO): 
                                  VBDs (SRO): 4fe09462-e806-a545-be1a-6627de74da4a; 5c991a4f-25ac-fe1b-a95c-d2f56a91410e
                              networks (MRO): 
                   PV-drivers-detected ( RO): true
                                 other (MRO): platform-feature-multiprocessor-suspend: 1; has-vendor-device: 1
                                  live ( RO): true
            guest-metrics-last-updated ( RO): 20210820T11:30:00Z
                   can-use-hotplug-vbd ( RO): unspecified
                   can-use-hotplug-vif ( RO): unspecified
              cooperative ( RO) [DEPRECATED]: <expensive field>
                                  tags (SRW): 
                             appliance ( RW): <not in database>
                     snapshot-schedule ( RW): <not in database>
                      is-vmss-snapshot ( RO): false
                           start-delay ( RW): 0
                        shutdown-delay ( RW): 0
                                 order ( RW): 0
                               version ( RO): 1
                         generation-id ( RO): 
             hardware-platform-version ( RO): 2
                     has-vendor-device ( RW): true
                       requires-reboot ( RO): false
                       reference-label ( RO): 
                          bios-strings (MRO): bios-vendor: Xen; bios-version: ; system-manufacturer: Xen; system-product-name: HVM domU; system-version: ; system-serial-number: ; baseboard-manufacturer: ; baseboard-product-name: ; baseboard-version: ; baseboard-serial-number: ; baseboard-asset-tag: ; baseboard-location-in-chassis: ; enclosure-asset-tag: ; hp-rombios: ; oem-1: Xen; oem-2: MS_VM_CERT/SHA1/bdbeb6e0a816d43fa6d3fe8aaef04c2bad9d3e3d

@shwstppr
Contributor Author

@nvazquez @rhtyd @tsinik-dw this is ready for review. Tested host upgrade and VM restart, VM migration, etc.
Running smoke tests.
@blueorangutan test centos7 xcpng82

@shwstppr
Contributor Author

@blueorangutan test centos7 xcpng82

@blueorangutan

@shwstppr a Trillian-Jenkins test job (centos7 mgmt + xcpng82) has been kicked to run smoke tests

@blueorangutan

Trillian test result (tid-1762)
Environment: xcpng82 (x2), Advanced Networking with Mgmt server 7
Total time taken: 41384 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr5335-t1762-xcpng82.zip
Intermittent failure detected: /marvin/tests/smoke/test_usage.py
Intermittent failure detected: /marvin/tests/smoke/test_host_maintenance.py
Smoke tests completed. 87 look OK, 0 have error(s)
Only failed tests results shown below:

Test Result Time (s) Test File

Contributor

@DaanHoogland DaanHoogland left a comment


small concern, LG otherwise

if (StringUtils.isEmpty(hostProductVersion)) {
    return false;
}
return hostProductVersion.compareTo(MIN_UEFI_SUPPORTED_VERSION) >= 0;
Contributor


Is this comparator able to decide that 10.3 > 8.2? I think this needs to be more robust.

Contributor Author


@DaanHoogland latest commit should fix this
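
The concern raised in this review thread can be illustrated with a minimal sketch. `String.compareTo` orders strings lexicographically, so "10.3" sorts before "8.2" even though it is the newer version. The class name `VersionCheckSketch`, the helper `compareVersions`, and the value "8.2" for `MIN_UEFI_SUPPORTED_VERSION` are assumptions made for illustration; this is one possible approach and not necessarily the fix that landed in the PR.

```java
final class VersionCheckSketch {
    // Assumed value for illustration: the PR description names XCP-ng 8.2
    // as the release that allows UEFI for guest VMs.
    static final String MIN_UEFI_SUPPORTED_VERSION = "8.2";

    /**
     * Compares dotted numeric versions component by component.
     * Missing components count as 0, so "8.2" equals "8.2.0".
     * Non-numeric components throw NumberFormatException in this sketch.
     */
    static int compareVersions(String a, String b) {
        String[] left = a.trim().split("\\.");
        String[] right = b.trim().split("\\.");
        int n = Math.max(left.length, right.length);
        for (int i = 0; i < n; i++) {
            int l = i < left.length ? Integer.parseInt(left[i]) : 0;
            int r = i < right.length ? Integer.parseInt(right[i]) : 0;
            if (l != r) {
                return Integer.compare(l, r);
            }
        }
        return 0;
    }

    static boolean isUefiSupported(String hostProductVersion) {
        if (hostProductVersion == null || hostProductVersion.trim().isEmpty()) {
            return false;
        }
        return compareVersions(hostProductVersion, MIN_UEFI_SUPPORTED_VERSION) >= 0;
    }

    public static void main(String[] args) {
        // Lexicographic comparison: '1' sorts before '8', so the result is negative.
        System.out.println("10.3".compareTo("8.2") < 0); // prints true
        System.out.println(isUefiSupported("10.3"));      // prints true
        System.out.println(isUefiSupported("8.1.0"));     // prints false
    }
}
```

Comparing integers per component avoids the string-ordering pitfall; a production version would also need to decide how to treat suffixes such as "-beta", which this sketch does not handle.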

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
@shwstppr
Contributor Author

@blueorangutan package

@blueorangutan

@shwstppr a Jenkins job has been kicked to build packages. I'll keep you posted as I make progress.

@blueorangutan

Packaging result: ✔️ el7 ✔️ el8 ✔️ debian. SL-JID 1032

Member

@yadvr yadvr left a comment


Needs QA, otherwise LGTM

@tsinik-dw

@shwstppr I saw your testing setup and adjusted the workflow accordingly for my PoC. Keeping in mind that in production I cannot turn off the VMs while upgrading ACS and XenServer, I followed these steps:

  1. Deployed ACS 4.13.1 with two XenServer 7.0 hosts
  2. VMs created (several Win 10 64bit), XenOrchestra shows Bios firmware setting as bios
  3. ACS upgraded using RPMs built from the add-bootmodetype-xen branch (reports version CloudStack 4.15.2.0-SNAPSHOT) while keeping VMs live
  4. XenServer 7.0 hosts upgraded to XCP-NG 8.2 while keeping VMs live

After migrating the VMs from XenServer 7.0 host to XCP-NG 8.2 host I saw that the Bios firmware setting of the VMs changed from bios to (default) bios
All VMs remained functional after upgrade to XCP-NG 8.2 and rebooted normally with no problems

To be honest I couldn't understand exactly how this worked. Could it be the change of the default (bios) setting triggered by ACS during reboot?
I haven't observed the value of the Bios firmware setting in my previous tests but I suppose this would have been changed to UEFI and thus the VMs couldn't start after a reboot. (I changed it manually and saw that this really happened again)

Overall, the test was successful but I'll do some more tests this weekend to further investigate the details of the above.

Lastly, I have a question. During VM creation I can set the bootType and bootMode (through Advanced settings), but once the VM is started there doesn't appear to be a way to change these settings again. Is it possible to change these settings again after the VM is created?

@shwstppr
Contributor Author

@tsinik-dw regarding reboot working normally: it worked because every time a VM is started or rebooted from ACS, the VM specs are set by StartCommand, and this PR changes that path to use BIOS as the default boot mode.

For changing the boot type and mode after starting the VM, you can set the values in the UI. Go to the VM details; there is a Settings tab there.
You can update or add a new key-value pair there.
Weirdly, to set the boot type to UEFI and the mode to Secure or Legacy, you need to set:
Key=UEFI, value=SECURE/LEGACY.

Note: above pattern is not added/changed by this PR
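
As a rough illustration of the key/value pattern described in this comment, the sketch below maps a VM details map to the two XenServer fields that appear in the `xe vm-list` output earlier in the thread (`HVM-boot-params` and `platform`). The class `BootDetailsSketch`, its nested `BootParams` holder, and the method `fromVmDetails` are hypothetical names, and the fallback to `firmware=bios` when the key is absent is an assumption based on the default-BIOS behaviour mentioned above. It is not the PR's actual `setVmBootDetails` code.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class BootDetailsSketch {
    /** Hypothetical holder for the two XenServer maps affected by boot options. */
    static final class BootParams {
        final Map<String, String> hvmBootParams = new LinkedHashMap<>();
        final Map<String, String> platform = new LinkedHashMap<>();
    }

    /**
     * The detail key carries the boot type ("UEFI") and its value carries
     * the boot mode ("SECURE" or "LEGACY"), as described in the thread.
     */
    static BootParams fromVmDetails(Map<String, String> details) {
        BootParams params = new BootParams();
        String mode = details.get("UEFI");
        if (mode == null) {
            // Assumption: no UEFI detail means the VM boots with BIOS firmware.
            params.hvmBootParams.put("firmware", "bios");
            return params;
        }
        params.hvmBootParams.put("firmware", "uefi");
        // Secure boot is enabled only for the SECURE mode.
        params.platform.put("secureboot", String.valueOf("SECURE".equalsIgnoreCase(mode)));
        return params;
    }

    public static void main(String[] args) {
        Map<String, String> details = new LinkedHashMap<>();
        details.put("UEFI", "LEGACY");
        BootParams params = fromVmDetails(details);
        System.out.println(params.hvmBootParams); // prints {firmware=uefi}
        System.out.println(params.platform);      // prints {secureboot=false}
    }
}
```

With `UEFI=LEGACY` this yields `firmware: uefi` and `secureboot: false`, which matches the values shown for the Windows Server 2019 test VM in the PR description.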

@nvazquez
Contributor

@blueorangutan test centos7 xcpng82

@blueorangutan

@nvazquez a Trillian-Jenkins test job (centos7 mgmt + xcpng82) has been kicked to run smoke tests

@blueorangutan

Trillian test result (tid-1816)
Environment: xcpng82 (x2), Advanced Networking with Mgmt server 7
Total time taken: 42397 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr5335-t1816-xcpng82.zip
Intermittent failure detected: /marvin/tests/smoke/test_host_maintenance.py
Smoke tests completed. 87 look OK, 0 have error(s)
Only failed tests results shown below:

Test Result Time (s) Test File

@yadvr
Member

yadvr commented Aug 31, 2021

@shwstppr is this ready for merging?

@shwstppr
Contributor Author

@rhtyd PR is ready from my end but @tsinik-dw mentioned he wanted to do some more tests over the weekend.
Maybe we should wait for his LGTM? Another option could be fixing any reported issues in a new PR

@yadvr
Member

yadvr commented Aug 31, 2021

Copy @shwstppr.
@tsinik-dw we can look forward to your tests; however, if there's no specific date/timeline, I propose we merge this PR based on the smoke tests and the tests provided, so that cutting 4.15.2 RC1 this week or early next week is not blocked.

@tsinik-dw

@rhtyd @shwstppr
Full testing has been done!... Everything worked as expected.
VMs (Win10, Win 2019) creation, live migration etc. worked like a charm for two different setups:

  1. ACS 4.15.1 installation then upgrade to 4.15.2 (with code fixes) with two XenServer 7.0 hosts which then upgraded to XCP-NG 8.2
  2. ACS 4.13.1 installation then upgrade to 4.15.2 (with code fixes) with two XenServer 7.0 hosts which then upgraded to XCP-NG 8.2

No further issues to report here :-) ...Excellent work!

@yadvr
Member

yadvr commented Aug 31, 2021

Merging this based on manual tests by author and @tsinik-dw and smoketests.

@yadvr yadvr merged commit 73cabcd into apache:4.15 Aug 31, 2021
@AlexanderKgr

AlexanderKgr commented Sep 17, 2021

when you shutdown the vm and then try to boot, it won't boot


Development

Successfully merging this pull request may close these issues.

After update to 4.15.1 all windows vms switch to uefi and they don't boot (xcp-ng 8.2)

8 participants